Rule-Extraction from Support Vector Machines
Abstract
Over the last few years, a number of studies on rule extraction from support vector machines (SVMs) have been introduced [1-5]. The research strategy in these projects is similar: to explore and develop algorithms for rule-extraction from SVMs based on the perception (or “view”) of the underlying SVM that is either explicitly or implicitly assumed within the rule-extraction technique. In the context of rule-extraction from artificial neural networks (ANNs) [6, 7], the notion of “translucency” describes the degree to which the internal representation of the ANN is accessible to the rule-extraction technique. More broadly, a taxonomy for rule-extraction from neural networks has been introduced [6, 7] which includes five evaluation criteria: translucency, rule quality, expressive power, portability and algorithmic complexity. These evaluation criteria are now commonly used for rule-extraction from SVMs.

The central thesis here is that the above-mentioned evaluation criteria cannot be applied to rule-extraction from SVMs, in particular those trained on very high-dimensional data, and that SVMs that generate structured output [8, 9] offer opportunities for rule-extraction and, hence, the explanation of learning results. The following briefly describes two of the five evaluation criteria for rule-extraction from neural networks, which are then discussed in the context of rule-extraction from SVMs. A new classification schema for rule-extraction from SVMs is presented, and an approach is outlined which (1) uses SVMs only, including those generating structured output, and (2) works well for high-dimensional data.

Rule-extraction from ANNs

At one end of the translucency spectrum are those rule-extraction techniques which view the underlying ANN at the maximum level of granularity, i.e. as a set of hidden and output units (“decompositional” techniques [10]).
The basic strategy of such decompositional rule-extraction techniques is to extract rules at the level of each individual hidden and output unit. In contrast, the strategy of “pedagogical” or learning-based methods is to view the trained ANN at the minimum possible level of granularity, i.e. as a single entity or “black box”. The focus is then on finding rules that map the ANN inputs (i.e. the attribute/value pairs from the problem domain) directly to outputs (e.g. membership-of or exclusion-from some target class [7]).

Rule quality covers accuracy, fidelity, consistency and comprehensibility. A rule set is considered accurate if it can correctly classify a set of previously unseen examples, and it displays a high level of fidelity if it can mimic the behaviour of the neural network from which it was extracted by capturing all of the information represented in the ANN. An extracted rule set is consistent if, under differing training sessions, the artificial neural network generates rule sets which produce the same classifications of unseen examples. Finally, the comprehensibility of a rule set is determined by measuring the size of the rule set (in terms of the number of rules) and the number of antecedents per rule [7].

Translucency and rule quality applied to rule extraction from SVMs

Most current studies on rule-extraction from SVMs focus on decompositional extraction; however, learning-based approaches are also available [4]. The idea is simple: combine SVM outputs with inputs from a data set and use any machine learning technique that produces rules or decision trees. Hence, pedagogical rule-extraction from SVMs is trivial, in particular if the data set is low-dimensional. To date, there is no technique for rule-extraction from SVMs that handles high-dimensional data, i.e. the core application domain for SVMs. Rule-extraction from support vector machines requires evaluation criteria that emphasize data.
The following dimensions are proposed: (1) translucency, (2) dimensionality of data, (3) expressiveness of the extracted rules, (4) portability, (5) rule quality and (6) algorithmic complexity of extraction.

The following technique utilizes a multi-class SVM [9] and scores high on several of the criteria mentioned above. Assume two SVMs, here labeled CSVM (classification SVM) and ESVM (explanation SVM). CSVM is trained on a binary decision problem with high-dimensional data, e.g. text classification; ESVM is trained on a multi-class classification task. ESVM takes as input the output of CSVM plus the original input pattern. ESVM’s target categories represent user-selected features from the CSVM input pattern plus ranges over the values of these attributes. In a text classification task, for instance, ESVM’s output classes represent content words and the frequency of their occurrence. Both the CSVM and ESVM outputs can be used to form conjunctive rules: the ESVM outputs are the set of antecedents and the CSVM output is the consequent. The entire rule set is then refined: duplicates as well as redundant rules and antecedents are eliminated.

This method is simple and purely learning-based, works for high-dimensional data and generates propositional rules. The technique is portable; however, fidelity is not guaranteed and large rule sets are possible. This is offset by the fact that the method is very efficient (rule generation is based on the CSVM/ESVM output only) and that explanation for individual input patterns is possible.

The objective of future research is to use SVMs that generate structured outputs directly for the generation of rule sets. A structured model is a scoring scheme over a set of combinatorial structures plus a method for finding the highest-ranking structure [8]. Hence, it should be possible to learn rule sets directly that offer the best explanation for an SVM learning result.
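The rule-formation and refinement steps of the CSVM/ESVM technique can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the CSVM and ESVM outputs are hard-coded stand-ins rather than predictions of trained SVMs, antecedent labels such as `profit:high` are hypothetical, and a rule is assumed to be "redundant" when a more general rule with the same consequent subsumes it, since the text does not define the term. Fidelity is measured here as simple agreement with the CSVM output on the same patterns.

```python
from typing import FrozenSet, Iterable, List, Sequence, Set, Tuple

# A rule is (antecedents, consequent): IF all antecedents hold THEN class.
Rule = Tuple[FrozenSet[str], int]


def form_rules(esvm_outputs: Sequence[Iterable[str]],
               csvm_outputs: Sequence[int]) -> List[Rule]:
    """Pair the ESVM antecedents of each pattern with its CSVM class."""
    return [(frozenset(ants), cls)
            for ants, cls in zip(esvm_outputs, csvm_outputs)]


def refine(rules: Sequence[Rule]) -> List[Rule]:
    """Remove duplicates, then rules subsumed by a more general rule."""
    unique = list(dict.fromkeys(rules))  # order-preserving de-duplication
    kept = []
    for ants, cls in unique:
        # Assumed redundancy test: another rule with the same consequent
        # needs only a strict subset of these antecedents.
        subsumed = any(other_cls == cls and other_ants < ants
                       for other_ants, other_cls in unique)
        if not subsumed:
            kept.append((ants, cls))
    return kept


def predict(rules: Sequence[Rule], features: Set[str], default: int) -> int:
    """Return the consequent of the first rule that fires, else default."""
    for ants, cls in rules:
        if ants <= features:
            return cls
    return default


def fidelity(rules: Sequence[Rule], patterns: Sequence[Iterable[str]],
             csvm_outputs: Sequence[int], default: int = 0) -> float:
    """Fraction of patterns on which the rule set agrees with CSVM."""
    agree = sum(predict(rules, set(p), default) == c
                for p, c in zip(patterns, csvm_outputs))
    return agree / len(csvm_outputs)


# Hypothetical stand-in outputs for five text documents (not real SVM output):
# each ESVM entry is a set of "word:frequency-range" labels.
esvm_outputs = [
    {"profit:high", "shares:low"},
    {"profit:high"},
    {"profit:high", "shares:low"},  # duplicate of the first pattern
    {"match:high"},
    {"match:high", "goal:low"},
]
csvm_outputs = [1, 1, 1, 0, 0]  # binary CSVM decision per document

rules = refine(form_rules(esvm_outputs, csvm_outputs))
for ants, cls in rules:
    print("IF", " AND ".join(sorted(ants)), "THEN class", cls)
print("fidelity:", fidelity(rules, esvm_outputs, csvm_outputs))
```

On this toy data the five raw rules collapse to two. With real SVM outputs the refined set may stay large and agreement with CSVM may fall below 1, which matches the caveats about rule-set size and fidelity stated above.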
Similar resources
Fuzzy Apriori Rule Extraction Using Multi-Objective Particle Swarm Optimization: The Case of Credit Scoring
Many methods have been introduced to solve the credit scoring problem, such as support vector machines, neural networks and rule-based classifiers. Rule bases are favoured in credit decision making because of their ability to explicitly distinguish between good and bad applicants. In this paper, multi-objective particle swarm optimization is applied to optimize a fuzzy apriori rule base in credit scoring. ...
A Quadratic Margin-Based Model for Weighting Fuzzy Classification Rules Inspired by Support Vector Machines
Recently, tuning the weights of the rules in fuzzy rule-based classification systems has been studied in order to improve classification accuracy. In this paper, a margin-based optimization model, inspired by support vector machine classifiers, is proposed to compute these fuzzy rule weights. This approach not only considers both accuracy and generalization criteria in a single objective fu...
Rule extraction from Support Vector Machines: a geometric approach
This paper presents a new approach to rule extraction from Support Vector Machines. SVMs have been applied successfully in many areas with excellent generalization results; rule extraction can offer explanation capability to SVMs. We propose to approximate the SVM classification boundary through querying followed by clustering, searching and then to extract rules by solving an optimization prob...
Eclectic Rule-Extraction from Support Vector Machines
superior performance compared to other machine learning techniques, especially in classification problems. Yet one limitation of SVMs is the lack of an explanation capability which is crucial in some applications, e.g. in the medical and security domains. In this paper, a novel approach for eclectic rule-extraction from support vector machines is presented. This approach utilizes the knowledge ...
Learning-based Rule-Extraction from Support Vector Machines
In recent years, support vector machines (SVMs) have shown good performance in a number of application areas, including text classification. However, the success of SVMs comes at a cost – an inability to explain the process by which a learning result was reached and why a decision is being made. Rule-extraction from SVMs is important for the acceptance of this machine learning technology, espec...